svcR: An R Package for Support Vector Clustering improved with Geometric Hashing applied to Lexical Pattern Discovery
Abstract
We present a new R package, svcR, that takes a numerical matrix as input and computes clusters using support vector clustering (SVC). We implemented an original 2D-grid labeling approach to speed up cluster extraction; with it, SVC becomes an efficient cluster extraction method when the clusters are separable in a 2-D map. We also show that SVC with a Jaccard-radial basis kernel classifies a set of terms into ontological classes well enough to help define regular expression rules for information extraction from documents. Our case study concerns a set of terms and documents about developmental and molecular biology.
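The abstract does not spell out how the 2D-grid labeling works, so the following is only a generic illustration of the idea, not the svcR implementation. It assumes that the 2-D map has already been discretized into a grid of cells, each flagged as inside or outside a cluster contour, and it labels 4-connected regions of inside cells by flood fill. The names `label_grid`, `assign_points`, and the toy `inside` grid are hypothetical, and Python is used here only for the sketch (the package itself is written in R).

```python
from collections import deque


def label_grid(inside):
    """Label 4-connected regions of True cells in a non-empty 2-D grid.

    Returns (labels, count): labels[r][c] is 0 for outside cells and a
    positive cluster id for inside cells; count is the number of regions.
    """
    rows, cols = len(inside), len(inside[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not inside[r][c] or labels[r][c]:
                continue  # outside the contour, or already labeled
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one region
                cr, cc = queue.popleft()
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and inside[nr][nc] and not labels[nr][nc]):
                        labels[nr][nc] = count
                        queue.append((nr, nc))
    return labels, count


def assign_points(point_cells, labels):
    """Give each data point the label of the grid cell it falls in."""
    return [labels[r][c] for r, c in point_cells]


# Toy grid (assumed data): True marks cells inside a cluster contour.
inside = [
    [True, True, False, False],
    [True, False, False, True],
    [False, False, True, True],
]
labels, n_clusters = label_grid(inside)
point_labels = assign_points([(0, 0), (1, 0), (2, 3), (0, 3)], labels)
```

In this sketch each grid cell is visited a bounded number of times, so the cost depends on the grid size rather than on pairwise comparisons between data points, which is one way a grid could speed up cluster extraction when clusters are separable in the 2-D map.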
Similar resources
svcR: a package for Support Vector Clustering improved with Geometric Hashing. Application to Lexical Pattern Discovery
We developed an R toolkit to manage data described by attributes, able to make clusters with a support vector clustering method (SVC). We have implemented an original 2D-grid labeling approach to extract clusters to optimize time processing. In this sense, svc can be seen as an efficient cluster extraction if clusters are separable in a 2-D map. Secondly we showed that this SVC approach using a...
A Regularized Nonsmooth Newton Method for Multi-class Support Vector Machines
Multi-class classification is an important and on-going research subject in machine learning. Recently, the ν-K-SVCR method was proposed by the authors for multi-class classification. Since many optimization problems have to be solved in multi-class classification, it is extremely important to develop an algorithm that can solve those optimization problems efficiently. In this paper, the optimi...
Detection and Classification of Breast Cancer in Mammography Images Using Pattern Recognition Methods
Introduction: In this paper, a method is presented to classify the breast cancer masses according to new geometric features. Methods: After obtaining digital breast mammogram images from the digital database for screening mammography (DDSM), image preprocessing was performed. Then, by using image processing methods, an algorithm was developed for automatic extracting of masses from other norma...
Regularized nonsmooth Newton method for multi-class support vector machines
Multi-class classification is an important and on-going research subject in machine learning. Recently, the ν-K-SVCR method was proposed by the authors for multi-class classification. Since many optimization problems have to be solved in multi-class classification, it is extremely important to develop an algorithm that can solve those optimization problems efficiently. In this paper, the optimi...
Journal title:
- CoRR
Volume abs/1504.06080
Pages -
Publication date 2010